Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 60015 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 6.0 MiB |
| Average record size in memory | 104.0 B |
Variable types
| Numeric | 10 |
|---|---|
| DateTime | 1 |
| Categorical | 2 |
Warnings
| Brændolie forbrug is highly correlated with Produktion til byen and 1 other field | High correlation |
| Produktion til byen is highly correlated with Brændolie forbrug and 1 other field | High correlation |
| Gram pr. kWh is highly correlated with Elvirkningsgrad | High correlation |
| Elvirkningsgrad is highly correlated with Gram pr. kWh | High correlation |
| Produktion total is highly correlated with Brændolie forbrug and 1 other field | High correlation |
| City is highly correlated with Distrikt | High correlation |
| Distrikt is highly correlated with City | High correlation |
| df_index has unique values | Unique |
Reproduction
| Analysis started | 2021-03-08 21:32:56.050964 |
|---|---|
| Analysis finished | 2021-03-08 21:33:05.827303 |
| Duration | 9.78 seconds |
| Software version | pandas-profiling v2.11.0 |
df_index
Real number (ℝ≥0)
| Distinct | 60015 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 50409.34993 |
|---|---|
| Minimum | 1 |
| Maximum | 103138 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3964.7 |
| Q1 | 26141.5 |
| median | 50913 |
| Q3 | 71927.5 |
| 95-th percentile | 98324.3 |
| Maximum | 103138 |
| Range | 103137 |
| Interquartile range (IQR) | 45786 |
Descriptive statistics
| Standard deviation | 29187.76089 |
|---|---|
| Coefficient of variation (CV) | 0.5790148243 |
| Kurtosis | -1.006423527 |
| Mean | 50409.34993 |
| Median Absolute Deviation (MAD) | 22749 |
| Skewness | 0.1417743328 |
| Sum | 3025317136 |
| Variance | 851925386 |
| Monotonicity | Strictly increasing |
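Several of the figures in the statistics tables follow from one another by simple arithmetic. As a minimal sketch, the snippet below recomputes the range, IQR, coefficient of variation, variance and mean from the other values reported for this first variable. The variable names are illustrative only, and the inputs are the rounded values printed in the report, so the results agree only up to rounding.

```python
# Values copied from the quantile and descriptive statistics tables above.
minimum, maximum = 1, 103138
q1, q3 = 26141.5, 71927.5
mean = 50409.34993
std = 29187.76089
total = 3025317136      # "Sum" row
n_rows = 60015          # number of observations

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance is the squared standard deviation
mean_from_sum = total / n_rows    # Mean recomputed from the sum

print(value_range, iqr, round(cv, 6), round(variance), round(mean_from_sum, 3))
```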
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 2049 | 1 | < 0.1% |
| 56689 | 1 | < 0.1% |
| 79180 | 1 | < 0.1% |
| 101723 | 1 | < 0.1% |
| 21856 | 1 | < 0.1% |
| 17762 | 1 | < 0.1% |
| 19811 | 1 | < 0.1% |
| 30052 | 1 | < 0.1% |
| 32101 | 1 | < 0.1% |
| 7529 | 1 | < 0.1% |
| Other values (60005) | 60005 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 |
| Value | Count | Frequency (%) |
| 103138 | 1 | |
| 103137 | 1 | |
| 103136 | 1 | |
| 103135 | 1 | |
| 103134 | 1 |
Brændolie forbrug
Real number (ℝ≥0)
| Distinct | 1511 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 386.2192785 |
|---|---|
| Minimum | 37 |
| Maximum | 2118 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 37 |
|---|---|
| 5-th percentile | 115 |
| Q1 | 195 |
| median | 272 |
| Q3 | 486 |
| 95-th percentile | 1030 |
| Maximum | 2118 |
| Range | 2081 |
| Interquartile range (IQR) | 291 |
Descriptive statistics
| Standard deviation | 289.1996441 |
|---|---|
| Coefficient of variation (CV) | 0.7487965004 |
| Kurtosis | 2.082280573 |
| Mean | 386.2192785 |
| Median Absolute Deviation (MAD) | 113 |
| Skewness | 1.584643528 |
| Sum | 23178950 |
| Variance | 83636.43417 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 213 | 260 | 0.4% |
| 205 | 257 | 0.4% |
| 211 | 250 | 0.4% |
| 207 | 247 | 0.4% |
| 210 | 245 | 0.4% |
| 222 | 245 | 0.4% |
| 216 | 243 | 0.4% |
| 204 | 242 | 0.4% |
| 215 | 240 | 0.4% |
| 223 | 240 | 0.4% |
| Other values (1501) | 57546 |
| Value | Count | Frequency (%) |
| 37 | 1 | < 0.1% |
| 43 | 2 | |
| 48 | 1 | < 0.1% |
| 49 | 3 | |
| 54 | 2 |
| Value | Count | Frequency (%) |
| 2118 | 1 | |
| 2080 | 1 | |
| 2021 | 1 | |
| 1971 | 1 | |
| 1889 | 2 |
Produktion til byen
Real number (ℝ≥0)
| Distinct | 4808 |
|---|---|
| Distinct (%) | 8.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1215.517471 |
|---|---|
| Minimum | 45 |
| Maximum | 7201 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 45 |
|---|---|
| 5-th percentile | 221 |
| Q1 | 483 |
| median | 786 |
| Q3 | 1605 |
| 95-th percentile | 3693 |
| Maximum | 7201 |
| Range | 7156 |
| Interquartile range (IQR) | 1122 |
Descriptive statistics
| Standard deviation | 1074.743537 |
|---|---|
| Coefficient of variation (CV) | 0.8841860054 |
| Kurtosis | 1.834940315 |
| Mean | 1215.517471 |
| Median Absolute Deviation (MAD) | 421 |
| Skewness | 1.555432012 |
| Sum | 72949281 |
| Variance | 1155073.67 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 591 | 78 | 0.1% |
| 569 | 75 | 0.1% |
| 560 | 75 | 0.1% |
| 596 | 75 | 0.1% |
| 648 | 70 | 0.1% |
| 558 | 68 | 0.1% |
| 609 | 67 | 0.1% |
| 590 | 66 | 0.1% |
| 612 | 66 | 0.1% |
| 611 | 66 | 0.1% |
| Other values (4798) | 59309 |
| Value | Count | Frequency (%) |
| 45 | 1 | |
| 115 | 1 | |
| 117 | 1 | |
| 119 | 2 | |
| 121 | 2 |
| Value | Count | Frequency (%) |
| 7201 | 1 | |
| 6732 | 1 | |
| 6627 | 1 | |
| 6029 | 1 | |
| 5993 | 1 |
Gram pr. kWh
Real number (ℝ≥0)
| Distinct | 39791 |
|---|---|
| Distinct (%) | 66.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 265.2053872 |
|---|---|
| Minimum | 129.2684 |
| Maximum | 799.2819 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 129.2684 |
|---|---|
| 5-th percentile | 205.28301 |
| Q1 | 239.6253 |
| median | 260.7339 |
| Q3 | 286.3636 |
| 95-th percentile | 334.8252 |
| Maximum | 799.2819 |
| Range | 670.0135 |
| Interquartile range (IQR) | 46.7383 |
Descriptive statistics
| Standard deviation | 47.49739073 |
|---|---|
| Coefficient of variation (CV) | 0.1790966285 |
| Kurtosis | 17.80997664 |
| Mean | 265.2053872 |
| Median Absolute Deviation (MAD) | 22.7992 |
| Skewness | 2.205562428 |
| Sum | 15916301.32 |
| Variance | 2256.002126 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 280 | 230 | 0.4% |
| 258.4615 | 65 | 0.1% |
| 315 | 56 | 0.1% |
| 240 | 52 | 0.1% |
| 252 | 46 | 0.1% |
| 265.2632 | 46 | 0.1% |
| 336 | 44 | 0.1% |
| 262.5 | 41 | 0.1% |
| 305.4546 | 38 | 0.1% |
| 296.4706 | 38 | 0.1% |
| Other values (39781) | 59359 |
| Value | Count | Frequency (%) |
| 129.2684 | 1 | |
| 129.4886 | 1 | |
| 129.5394 | 1 | |
| 129.5745 | 1 | |
| 129.6035 | 1 |
| Value | Count | Frequency (%) |
| 799.2819 | 1 | |
| 795.7895 | 1 | |
| 792.248 | 1 | |
| 792.1752 | 1 | |
| 787.0114 | 1 |
Elvirkningsgrad
Real number (ℝ≥0)
| Distinct | 38859 |
|---|---|
| Distinct (%) | 64.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 32.69959493 |
|---|---|
| Minimum | 10.541 |
| Maximum | 64.9811 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 10.541 |
|---|---|
| 5-th percentile | 25.16049 |
| Q1 | 29.4185 |
| median | 32.3101 |
| Q3 | 35.15265 |
| 95-th percentile | 41.0318 |
| Maximum | 64.9811 |
| Range | 54.4401 |
| Interquartile range (IQR) | 5.73415 |
Descriptive statistics
| Standard deviation | 5.734713307 |
|---|---|
| Coefficient of variation (CV) | 0.1753756681 |
| Kurtosis | 5.819061801 |
| Mean | 32.69959493 |
| Median Absolute Deviation (MAD) | 2.8673 |
| Skewness | 1.391859793 |
| Sum | 1962466.19 |
| Variance | 32.88693672 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 30.09 | 221 | 0.4% |
| 32.5975 | 63 | 0.1% |
| 26.7467 | 54 | 0.1% |
| 35.105 | 46 | 0.1% |
| 31.7617 | 44 | 0.1% |
| 33.4333 | 43 | 0.1% |
| 25.075 | 42 | 0.1% |
| 32.096 | 40 | 0.1% |
| 28.4183 | 37 | 0.1% |
| 32.3189 | 36 | 0.1% |
| Other values (38849) | 59389 |
| Value | Count | Frequency (%) |
| 10.541 | 1 | |
| 10.5872 | 1 | |
| 10.6345 | 1 | |
| 10.6355 | 1 | |
| 10.7053 | 1 |
| Value | Count | Frequency (%) |
| 64.9811 | 1 | |
| 64.967 | 1 | |
| 64.9442 | 1 | |
| 64.9311 | 1 | |
| 64.9138 | 1 |
Produktion total
Real number (ℝ≥0)
| Distinct | 4864 |
|---|---|
| Distinct (%) | 8.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1305.944947 |
|---|---|
| Minimum | 169 |
| Maximum | 7600 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 169 |
|---|---|
| 5-th percentile | 319 |
| Q1 | 589 |
| median | 861 |
| Q3 | 1686 |
| 95-th percentile | 3814.3 |
| Maximum | 7600 |
| Range | 7431 |
| Interquartile range (IQR) | 1097 |
Descriptive statistics
| Standard deviation | 1077.217272 |
|---|---|
| Coefficient of variation (CV) | 0.8248565716 |
| Kurtosis | 2.09494938 |
| Mean | 1305.944947 |
| Median Absolute Deviation (MAD) | 390 |
| Skewness | 1.621168806 |
| Sum | 78376286 |
| Variance | 1160397.051 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 684 | 79 | 0.1% |
| 595 | 78 | 0.1% |
| 598 | 78 | 0.1% |
| 633 | 77 | 0.1% |
| 601 | 77 | 0.1% |
| 638 | 76 | 0.1% |
| 663 | 75 | 0.1% |
| 771 | 75 | 0.1% |
| 607 | 74 | 0.1% |
| 655 | 74 | 0.1% |
| Other values (4854) | 59252 |
| Value | Count | Frequency (%) |
| 169 | 1 | |
| 181 | 1 | |
| 183 | 1 | |
| 184 | 1 | |
| 185 | 1 |
| Value | Count | Frequency (%) |
| 7600 | 1 | |
| 6907 | 1 | |
| 6784 | 1 | |
| 6565 | 1 | |
| 6098 | 1 |
Vejlys
Real number (ℝ≥0)
| Distinct | 147 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22.53091727 |
|---|---|
| Minimum | 1 |
| Maximum | 377 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 6 |
| median | 17 |
| Q3 | 32 |
| 95-th percentile | 64 |
| Maximum | 377 |
| Range | 376 |
| Interquartile range (IQR) | 26 |
Descriptive statistics
| Standard deviation | 21.23404264 |
|---|---|
| Coefficient of variation (CV) | 0.9424402203 |
| Kurtosis | 4.855142641 |
| Mean | 22.53091727 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 1.705729542 |
| Sum | 1352193 |
| Variance | 450.8845667 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 3928 | 6.5% |
| 2 | 3327 | 5.5% |
| 6 | 2151 | 3.6% |
| 7 | 1938 | 3.2% |
| 8 | 1929 | 3.2% |
| 3 | 1910 | 3.2% |
| 4 | 1905 | 3.2% |
| 5 | 1816 | 3.0% |
| 9 | 1705 | 2.8% |
| 11 | 1518 | 2.5% |
| Other values (137) | 37888 |
| Value | Count | Frequency (%) |
| 1 | 3928 | |
| 2 | 3327 | |
| 3 | 1910 | |
| 4 | 1905 | |
| 5 | 1816 |
| Value | Count | Frequency (%) |
| 377 | 1 | |
| 245 | 1 | |
| 227 | 1 | |
| 209 | 1 | |
| 204 | 1 |
Eget forbrug
Real number (ℝ≥0)
| Distinct | 377 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 44.66428393 |
|---|---|
| Minimum | 5 |
| Maximum | 1177 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 12 |
| Q1 | 22 |
| median | 30 |
| Q3 | 46 |
| 95-th percentile | 136 |
| Maximum | 1177 |
| Range | 1172 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 45.98516737 |
|---|---|
| Coefficient of variation (CV) | 1.029573595 |
| Kurtosis | 24.18175845 |
| Mean | 44.66428393 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 3.755468407 |
| Sum | 2680527 |
| Variance | 2114.635618 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 25 | 2009 | 3.3% |
| 26 | 1980 | 3.3% |
| 28 | 1960 | 3.3% |
| 24 | 1948 | 3.2% |
| 23 | 1935 | 3.2% |
| 27 | 1829 | 3.0% |
| 22 | 1821 | 3.0% |
| 29 | 1744 | 2.9% |
| 21 | 1701 | 2.8% |
| 16 | 1637 | 2.7% |
| Other values (367) | 41451 |
| Value | Count | Frequency (%) |
| 5 | 136 | 0.2% |
| 6 | 577 | |
| 7 | 1102 | |
| 8 | 697 | |
| 9 | 110 | 0.2% |
| Value | Count | Frequency (%) |
| 1177 | 1 | |
| 1005 | 1 | |
| 731 | 1 | |
| 689 | 1 | |
| 590 | 1 |
Date
Date
| Distinct | 4076 |
|---|---|
| Distinct (%) | 6.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 469.0 KiB |
| Minimum | 2008-11-28 00:00:00 |
|---|---|
| Maximum | 2020-11-30 00:00:00 |
Histogram with fixed size bins (bins=50)
City
Categorical
| Distinct | 36 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 469.0 KiB |
| 024 Qassimiut | 3901 |
|---|---|
| 021 Saarloq | 3685 |
| 013 Narsarmijit | 3662 |
| 072 Napasoq | 3417 |
| 035 Qassiarsuk | 3232 |
| Other values (31) |
Length
| Max length | 21 |
|---|---|
| Median length | 13 |
| Mean length | 12.96940765 |
| Min length | 8 |
Characters and Unicode
| Total characters | 778359 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 092 Attu |
|---|---|
| 2nd row | 092 Attu |
| 3rd row | 092 Attu |
| 4th row | 092 Attu |
| 5th row | 092 Attu |
| Value | Count | Frequency (%) |
| 024 Qassimiut | 3901 | 6.5% |
| 021 Saarloq | 3685 | 6.1% |
| 013 Narsarmijit | 3662 | 6.1% |
| 072 Napasoq | 3417 | 5.7% |
| 035 Qassiarsuk | 3232 | 5.4% |
| 092 Attu | 3166 | 5.3% |
| 124 Ilimanaq | 2928 | 4.9% |
| 095 Iginniarfik | 2888 | 4.8% |
| 169 Innarsuit | 2718 | 4.5% |
| 182 Sermiligaaq | 2662 | 4.4% |
| Other values (26) | 27756 |
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| qassimiut | 3901 | 3.2% |
| 024 | 3901 | 3.2% |
| saarloq | 3685 | 3.1% |
| 021 | 3685 | 3.1% |
| narsarmijit | 3662 | 3.0% |
| 013 | 3662 | 3.0% |
| 072 | 3417 | 2.8% |
| napasoq | 3417 | 2.8% |
| qassiarsuk | 3232 | 2.7% |
| 035 | 3232 | 2.7% |
| Other values (63) | 84520 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 91725 | 11.8% |
| (space) | 60299 | 7.7% |
| i | 60169 | 7.7% |
| s | 58533 | 7.5% |
| r | 42433 | 5.5% |
| 1 | 41496 | 5.3% |
| 0 | 37819 | 4.9% |
| u | 36671 | 4.7% |
| t | 31192 | 4.0% |
| q | 25753 | 3.3% |
| Other values (28) | 292269 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 477716 | |
| Decimal Number | 180045 | 23.1% |
| Space Separator | 60299 | 7.7% |
| Uppercase Letter | 60299 | 7.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 91725 | |
| i | 60169 | |
| s | 58533 | |
| r | 42433 | |
| u | 36671 | 7.7% |
| t | 31192 | 6.5% |
| q | 25753 | 5.4% |
| k | 23298 | 4.9% |
| n | 22661 | 4.7% |
| l | 18880 | 4.0% |
| Other values (8) | 66401 |
| Value | Count | Frequency (%) |
| 1 | 41496 | |
| 0 | 37819 | |
| 2 | 22124 | |
| 5 | 17171 | |
| 3 | 13293 | 7.4% |
| 6 | 12183 | 6.8% |
| 9 | 11369 | 6.3% |
| 8 | 8565 | 4.8% |
| 4 | 8215 | 4.6% |
| 7 | 7810 | 4.3% |
| Value | Count | Frequency (%) |
| I | 14906 | |
| N | 9725 | |
| A | 9228 | |
| S | 8636 | |
| Q | 8430 | |
| K | 6536 | |
| T | 2195 | 3.6% |
| U | 359 | 0.6% |
| P | 284 | 0.5% |
| Value | Count | Frequency (%) |
| (space) | 60299 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 538015 | |
| Common | 240344 |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 91725 | |
| i | 60169 | |
| s | 58533 | |
| r | 42433 | 7.9% |
| u | 36671 | 6.8% |
| t | 31192 | 5.8% |
| q | 25753 | 4.8% |
| k | 23298 | 4.3% |
| n | 22661 | 4.2% |
| l | 18880 | 3.5% |
| Other values (17) | 126700 |
| Value | Count | Frequency (%) |
| (space) | 60299 | |
| 1 | 41496 | |
| 0 | 37819 | |
| 2 | 22124 | 9.2% |
| 5 | 17171 | 7.1% |
| 3 | 13293 | 5.5% |
| 6 | 12183 | 5.1% |
| 9 | 11369 | 4.7% |
| 8 | 8565 | 3.6% |
| 4 | 8215 | 3.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 778359 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 91725 | 11.8% |
| (space) | 60299 | 7.7% |
| i | 60169 | 7.7% |
| s | 58533 | 7.5% |
| r | 42433 | 5.5% |
| 1 | 41496 | 5.3% |
| 0 | 37819 | 4.9% |
| u | 36671 | 4.7% |
| t | 31192 | 4.0% |
| q | 25753 | 3.3% |
| Other values (28) | 292269 |
City_code
Real number (ℝ≥0)
| Distinct | 36 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 94.6574523 |
|---|---|
| Minimum | 13 |
| Maximum | 186 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 13 |
|---|---|
| 5-th percentile | 13 |
| Q1 | 35 |
| median | 95 |
| Q3 | 153 |
| 95-th percentile | 182 |
| Maximum | 186 |
| Range | 173 |
| Interquartile range (IQR) | 118 |
Descriptive statistics
| Standard deviation | 58.1105487 |
|---|---|
| Coefficient of variation (CV) | 0.6139035785 |
| Kurtosis | -1.396710239 |
| Mean | 94.6574523 |
| Median Absolute Deviation (MAD) | 60 |
| Skewness | 0.09876336523 |
| Sum | 5680867 |
| Variance | 3376.83587 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=36)
| Value | Count | Frequency (%) |
| 24 | 3901 | 6.5% |
| 21 | 3685 | 6.1% |
| 13 | 3662 | 6.1% |
| 72 | 3417 | 5.7% |
| 35 | 3232 | 5.4% |
| 92 | 3166 | 5.3% |
| 124 | 2928 | 4.9% |
| 95 | 2888 | 4.8% |
| 169 | 2718 | 4.5% |
| 182 | 2662 | 4.4% |
| Other values (26) | 27756 |
| Value | Count | Frequency (%) |
| 13 | 3662 | |
| 16 | 1020 | 1.7% |
| 18 | 284 | 0.5% |
| 21 | 3685 | |
| 24 | 3901 |
| Value | Count | Frequency (%) |
| 186 | 682 | 1.1% |
| 184 | 393 | 0.7% |
| 183 | 1115 | |
| 182 | 2662 | |
| 171 | 1215 |
Distrikt
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 469.0 KiB |
| 005 Kujalleq | |
|---|---|
| 009 Avannaa | |
| 008 Disko | |
| 007 Qeqqata | |
| 004 Ilulisat |
Length
| Max length | 12 |
|---|---|
| Median length | 11 |
| Mean length | 10.89111056 |
| Min length | 8 |
Characters and Unicode
| Total characters | 653630 |
|---|---|
| Distinct characters | 27 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 008 Disko |
|---|---|
| 2nd row | 008 Disko |
| 3rd row | 008 Disko |
| 4th row | 008 Disko |
| 5th row | 008 Disko |
| Value | Count | Frequency (%) |
| 005 Kujalleq | 19372 | |
| 009 Avannaa | 14090 | |
| 008 Disko | 11081 | |
| 007 Qeqqata | 9885 | |
| 004 Ilulisat | 3254 | 5.4% |
| 006 Nuuk | 2333 | 3.9% |
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| kujalleq | 19372 | |
| 005 | 19372 | |
| 009 | 14090 | |
| avannaa | 14090 | |
| disko | 11081 | |
| 008 | 11081 | |
| qeqqata | 9885 | |
| 007 | 9885 | |
| ilulisat | 3254 | 2.7% |
| 004 | 3254 | 2.7% |
| Other values (2) | 4666 | 3.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 120030 | |
| a | 84666 | |
| (space) | 60015 | 9.2% |
| l | 45252 | 6.9% |
| q | 39142 | 6.0% |
| e | 29257 | 4.5% |
| n | 28180 | 4.3% |
| u | 27292 | 4.2% |
| 5 | 19372 | 3.0% |
| K | 19372 | 3.0% |
| Other values (17) | 181052 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 353555 | |
| Decimal Number | 180045 | |
| Space Separator | 60015 | 9.2% |
| Uppercase Letter | 60015 | 9.2% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 84666 | |
| l | 45252 | |
| q | 39142 | |
| e | 29257 | 8.3% |
| n | 28180 | 8.0% |
| u | 27292 | 7.7% |
| j | 19372 | 5.5% |
| i | 14335 | 4.1% |
| s | 14335 | 4.1% |
| v | 14090 | 4.0% |
| Other values (3) | 37634 |
| Value | Count | Frequency (%) |
| 0 | 120030 | |
| 5 | 19372 | 10.8% |
| 9 | 14090 | 7.8% |
| 8 | 11081 | 6.2% |
| 7 | 9885 | 5.5% |
| 4 | 3254 | 1.8% |
| 6 | 2333 | 1.3% |
| Value | Count | Frequency (%) |
| K | 19372 | |
| A | 14090 | |
| D | 11081 | |
| Q | 9885 | |
| I | 3254 | 5.4% |
| N | 2333 | 3.9% |
| Value | Count | Frequency (%) |
| (space) | 60015 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 413570 | |
| Common | 240060 |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 84666 | |
| l | 45252 | |
| q | 39142 | |
| e | 29257 | 7.1% |
| n | 28180 | 6.8% |
| u | 27292 | 6.6% |
| K | 19372 | 4.7% |
| j | 19372 | 4.7% |
| i | 14335 | 3.5% |
| s | 14335 | 3.5% |
| Other values (9) | 92367 |
| Value | Count | Frequency (%) |
| 0 | 120030 | |
| (space) | 60015 | |
| 5 | 19372 | 8.1% |
| 9 | 14090 | 5.9% |
| 8 | 11081 | 4.6% |
| 7 | 9885 | 4.1% |
| 4 | 3254 | 1.4% |
| 6 | 2333 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 653630 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 120030 | |
| a | 84666 | |
| (space) | 60015 | 9.2% |
| l | 45252 | 6.9% |
| q | 39142 | 6.0% |
| e | 29257 | 4.5% |
| n | 28180 | 4.3% |
| u | 27292 | 4.2% |
| 5 | 19372 | 3.0% |
| K | 19372 | 3.0% |
| Other values (17) | 181052 |
pop
Real number (ℝ≥0)
| Distinct | 157 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 116.8943931 |
|---|---|
| Minimum | 0 |
| Maximum | 459 |
| Zeros | 492 |
| Zeros (%) | 0.8% |
| Memory size | 469.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 20 |
| Q1 | 53 |
| median | 83 |
| Q3 | 184 |
| 95-th percentile | 262 |
| Maximum | 459 |
| Range | 459 |
| Interquartile range (IQR) | 131 |
Descriptive statistics
| Standard deviation | 97.34926419 |
|---|---|
| Coefficient of variation (CV) | 0.8327966949 |
| Kurtosis | 2.694094307 |
| Mean | 116.8943931 |
| Median Absolute Deviation (MAD) | 44 |
| Skewness | 1.587817832 |
| Sum | 7015417 |
| Variance | 9476.879239 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 83 | 1625 | 2.7% |
| 52 | 1278 | 2.1% |
| 80 | 1268 | 2.1% |
| 202 | 1103 | 1.8% |
| 24 | 1089 | 1.8% |
| 85 | 1081 | 1.8% |
| 81 | 1024 | 1.7% |
| 53 | 1019 | 1.7% |
| 73 | 1004 | 1.7% |
| 206 | 987 | 1.6% |
| Other values (147) | 48537 |
| Value | Count | Frequency (%) |
| 0 | 492 | |
| 2 | 447 | |
| 4 | 37 | 0.1% |
| 5 | 505 | |
| 6 | 559 |
| Value | Count | Frequency (%) |
| 459 | 215 | |
| 457 | 246 | |
| 456 | 230 | |
| 453 | 414 | |
| 448 | 250 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
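The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code the report itself runs; the function name `pearson_r` is an assumption, and the sample values are the first five `Brændolie forbrug` and `Produktion total` entries from the "First rows" table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the variances cancel, so plain sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # A constant column has zero variance and would raise ZeroDivisionError here.
    return cov / sqrt(var_x * var_y)

# First five rows of the sample table (hypothetical choice of column pair).
fuel = [389.0, 316.0, 535.0, 401.0, 291.0]
total = [1530.0, 1589.0, 1678.0, 1512.0, 1436.0]
r = pearson_r(fuel, total)
print(r)
```

Shifting or rescaling either input by a positive factor, for example `pearson_r(fuel, [3 * t + 7 for t in total])`, leaves the result unchanged up to floating-point error, which is the invariance property mentioned above.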
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
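As a rough sketch of that recipe, the snippet below converts each sample to ranks and then applies the covariance-over-standard-deviations formula to the ranks. The helper names are hypothetical, and giving tied values the average of their positions is an assumption about tie handling, not something the report states.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# A monotonic but nonlinear relationship: y = x cubed.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
print(rho)
```

Because the cubic relationship is strictly increasing, the ranks of the two samples coincide, so ρ reaches its maximum even though the relationship is not linear.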
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
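The pair-counting definition above corresponds to the variant usually called tau-a, and a minimal sketch of it follows. The function name is hypothetical, and statistical libraries often default to a tie-corrected variant (tau-b), so values on data with many ties may differ from what such a library reports.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; tau-a counts it as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

This brute-force version compares every pair and therefore runs in quadratic time, which is fine for illustration but slow for a column with tens of thousands of rows.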
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proven to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | df_index | Brændolie forbrug | Produktion til byen | Gram pr. kWh | Elvirkningsgrad | Produktion total | Vejlys | Eget forbrug | Date | City | City_code | Distrikt | pop |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 389.0 | 1442.0 | 213.5686 | 39.4496 | 1530.0 | 9.0 | 74.0 | 2013-05-01 | 092 Attu | 92 | 008 Disko | 231 |
| 1 | 2 | 316.0 | 1486.0 | 167.0485 | 50.4357 | 1589.0 | 9.0 | 89.0 | 2013-05-02 | 092 Attu | 92 | 008 Disko | 231 |
| 2 | 3 | 535.0 | 1588.0 | 267.8188 | 31.4586 | 1678.0 | 8.0 | 75.0 | 2013-05-03 | 092 Attu | 92 | 008 Disko | 231 |
| 3 | 4 | 401.0 | 1422.0 | 222.7778 | 37.8189 | 1512.0 | 6.0 | 79.0 | 2013-05-04 | 092 Attu | 92 | 008 Disko | 231 |
| 4 | 5 | 291.0 | 1369.0 | 170.2228 | 49.4951 | 1436.0 | 6.0 | 55.0 | 2013-05-05 | 092 Attu | 92 | 008 Disko | 231 |
| 5 | 6 | 455.0 | 1426.0 | 258.7678 | 32.5589 | 1477.0 | 7.0 | 39.0 | 2013-05-06 | 092 Attu | 92 | 008 Disko | 231 |
| 6 | 7 | 431.0 | 1482.0 | 232.9730 | 36.1638 | 1554.0 | 6.0 | 60.0 | 2013-05-07 | 092 Attu | 92 | 008 Disko | 231 |
| 7 | 8 | 392.0 | 1379.0 | 225.5342 | 37.3566 | 1460.0 | 7.0 | 69.0 | 2013-05-08 | 092 Attu | 92 | 008 Disko | 231 |
| 8 | 9 | 359.0 | 1268.0 | 226.0570 | 37.2703 | 1334.0 | 5.0 | 56.0 | 2013-05-09 | 092 Attu | 92 | 008 Disko | 231 |
| 9 | 10 | 391.0 | 1387.0 | 223.7330 | 37.6574 | 1468.0 | 5.0 | 69.0 | 2013-05-10 | 092 Attu | 92 | 008 Disko | 231 |
Last rows
| | df_index | Brændolie forbrug | Produktion til byen | Gram pr. kWh | Elvirkningsgrad | Produktion total | Vejlys | Eget forbrug | Date | City | City_code | Distrikt | pop |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 60005 | 103129 | 130.0 | 345.0 | 282.9016 | 29.7814 | 386.0 | 8.0 | 32.0 | 2013-08-22 | 095 Iginniarfik | 95 | 008 Disko | 82 |
| 60006 | 103130 | 129.0 | 351.0 | 278.5604 | 30.2455 | 389.0 | 9.0 | 31.0 | 2013-08-23 | 095 Iginniarfik | 95 | 008 Disko | 82 |
| 60007 | 103131 | 137.0 | 347.0 | 298.9091 | 28.1865 | 385.0 | 9.0 | 28.0 | 2013-08-24 | 095 Iginniarfik | 95 | 008 Disko | 82 |
| 60008 | 103132 | 127.0 | 350.0 | 271.4504 | 31.0377 | 393.0 | 10.0 | 35.0 | 2013-08-25 | 095 Iginniarfik | 95 | 008 Disko | 82 |
| 60009 | 103133 | 134.0 | 362.0 | 279.3052 | 30.1649 | 403.0 | 9.0 | 31.0 | 2013-08-26 | 095 Iginniarfik | 95 | 008 Disko | 82 |
| 60010 | 103134 | 130.0 | 324.0 | 300.0000 | 28.0840 | 364.0 | 10.0 | 30.0 | 2013-08-27 | 095 Iginniarfik | 95 | 008 Disko | 82 |
| 60011 | 103135 | 145.0 | 356.0 | 310.7143 | 27.1156 | 392.0 | 9.0 | 28.0 | 2013-08-28 | 095 Iginniarfik | 95 | 008 Disko | 82 |
| 60012 | 103136 | 136.0 | 373.0 | 267.5410 | 31.4912 | 427.0 | 10.0 | 42.0 | 2013-08-29 | 095 Iginniarfik | 95 | 008 Disko | 82 |
| 60013 | 103137 | 135.0 | 367.0 | 277.9412 | 30.3129 | 408.0 | 11.0 | 31.0 | 2013-08-30 | 095 Iginniarfik | 95 | 008 Disko | 82 |
| 60014 | 103138 | 131.0 | 339.0 | 290.3430 | 29.0181 | 379.0 | 10.0 | 30.0 | 2013-08-31 | 095 Iginniarfik | 95 | 008 Disko | 82 |